Nur Shlapobersky + Sage Voorhees
SES 5394: Travel Behavior and Forecasting
Spring Semester 2023
Figure 1: Sketch of Oklahoma City boundaries and interstate highways.
Oklahoma City is the capital of Oklahoma and the largest city in the state. Three major interstates (I-35, I-40, and I-44) pass through OKC. As of the 2020 Census, the OKC metro area is majority white, with a population just shy of 1.5 million people (sidenote 1: 2020 Census and 2021 American Community Survey).
| Race | Percent of Population |
|---|---|
| White | 62% |
| African American | 10% |
| Native American | 3% |
| Asian | 3% |
| Multi-racial | 8% |
| Other | 1% |
| – | – |
| Hispanic | 14% |
Some well-known neighborhoods in OKC include
Figure 2: The Bricktown neighborhood.
For this analysis we are using a classic four-step travel demand model. Because this is a student assignment with limited capacity, we are using various shortcuts throughout the model that will be identified. Shortcomings with the way we have modeled transportation do not necessarily reflect shortcomings of the four-step model.
Figure 3: Overview of the four-step model
Figure 4: Number of households by census tract
Our transportation analysis looks at 419 traffic analysis zones (TAZs) across 7 counties, each zone corresponding to a census tract. In Figure 4 we can see that Oklahoma City follows a typical metropolitan pattern: a dense and active urban core surrounded by suburbs and rural areas.
The longest travel time between zones by car was just over 3 hours and 10 minutes (190.5 minutes). The shortest was half a minute (0.5 minutes). The average travel time between TAZ centroids is roughly 30 minutes (30.7 minutes), and the median is around 25 minutes (25.6 minutes). Roads highlighted in red in Figure 5 were modeled as two-way rural roads.
Figure 5: The modeled Oklahoma City road network.
Figure 6: Full county map with public transit.
The OKC transit network is composed of 651 miles of bus routes across 30 bus lines. The map below shows the bus network in detail and in the context of the whole city. Of the 419 traffic analysis zones in the OKC metro area, the transit network connects only 135. The longest travel time between zones is just over 3 hours and 10 minutes (190.5 minutes), and the shortest is half a minute (0.5 minutes). The average travel time between centroids is roughly 30 minutes (30.7 minutes), and the median is around 25 minutes (25.6 minutes). The public transit network is fully contained in 3 of the 7 counties that make up the OKC statistical area.
Figure 7: The public transit network in Oklahoma City.
Using the networks we created, we generated travel time skims, which provide travel times between every pair of TAZs. By selecting a subset we can map every zone’s travel time by car to the University of Oklahoma, as in Figure 8.
Figure 8: Travel time by car from the University of Oklahoma
We can likewise map the travel time by bus from the University to other zones, as in Figure 9 (note that many zones are grayed out because they cannot be reached by bus).
Figure 9: Public transit travel time to the University of Oklahoma
Accessibility is a measure of how many destinations travelers can reach within a perceived reasonable time using the transportation modes available to them. Put in an alternative, slightly more mathematical way:
Mobility: reasonably reachable area
Proximity: opportunities per area
\[accessibility = mobility * proximity\]
We determine accessibility based on the network skims mentioned earlier and employment data. Travel times are used in a decay function to scale the “worth” of each opportunity, and these are all summed together to determine the accessibility score. See Appendix B for more information.
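The scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the code we ran: the logistic form (including the conversion from standard deviation to the curve's scale parameter) is an assumption, the function names are hypothetical, and the zone data are made up. Only the inflection point of 25 and standard deviation of 5 come from our stated assumptions.

```python
import math

def logistic_decay(minutes, inflection=25.0, sd=5.0):
    """Weight between 0 and 1 for an opportunity `minutes` away.

    Assumed form: a logistic curve equal to 0.5 at `inflection`, with `sd`
    converted to the logistic scale parameter (scale = sd * sqrt(3) / pi).
    """
    scale = sd * math.sqrt(3) / math.pi
    return 1.0 / (1.0 + math.exp((minutes - inflection) / scale))

def accessibility(travel_times, jobs):
    """Sum of decay-weighted jobs reachable from one origin zone.

    travel_times: minutes from the origin to each destination (None = unreachable)
    jobs: number of jobs in each destination zone, in the same order
    """
    total = 0.0
    for minutes, n_jobs in zip(travel_times, jobs):
        if minutes is None:
            continue  # e.g. a zone with no transit connection
        total += n_jobs * logistic_decay(minutes)
    return total

# Hypothetical origin zone with three destinations
times = [5.0, 25.0, None]
jobs = [1000, 2000, 500]
score = accessibility(times, jobs)
print(round(score, 1))
```

Jobs 5 minutes away count almost fully, jobs exactly at the inflection point count for half, and unreachable zones contribute nothing, which is why zones without transit service drop out of the transit accessibility maps.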
Car access is distributed as is typical for a metropolitan area: the downtown, being both dense and centrally located, has higher scores than the outlying areas.
Figure 10: Car accessibility scores for each zone
Although they do not cover a large share of the land area, the high-scoring downtown zones are numerous because they are small, and they form the right peak in the distribution shown in Figure 11. The left peak represents the outlying rural zones.
Figure 11: Distribution of zone accessibility scores
Transit in Oklahoma City is largely limited to the areas in and around Downtown and the University campus. The bus lines between the two areas notably bypass most of the zones in between, creating the two island-like regions in Figure 12. Looking at the linear-scale accessibility map, we can see that the majority of those zones have very similar low scores. There are just a few outliers with much higher accessibility scores, due to their proximity to transit hubs where many of the bus lines meet.
Figure 12: Transit accessibility scores for each zone (on a log scale and a linear scale)
Those outliers can also be seen in Figure 13 at the far right tail of the distribution.
Figure 13: Distribution of zone accessibility scores
To generate trip productions and attractions for each traffic analysis zone, we divided trips into three main categories: home-based work (HBW), home-based other (HBO), and non-home-based (NHB).
We estimated trip productions and attractions with linear regressions on factors present in both the NHTS (sidenote 2: National Household Travel Survey, 2017) and the ACS.
We used:
Our regressions had very low R-squared values, ranging from 0.124 to 0.129, and only household size and the presence of children were statistically significant. The resulting trips by type were as follows:
| Trip Type | Total Trips |
|---|---|
| Home Based Work | 922,115 |
| Home Based Other | 2,829,372 |
| Non Home Based | 2,690,824 |
Figure 14: Home Based Other, Trip Productions and Attractions
Figure 15: Non-Home Based, Trip Productions and Attractions
We also used NHTS data to examine mode share in OKC based on various trip types.
Figure 16: Mode share in the OKC metro area
Figure 17: Trip purpose by mode share in Oklahoma City metro area.
Figure 18: Detailed trip purpose and mode share in Oklahoma City Metro Area
Trip distribution is calculated through a gravity model. To build our gravity model and calculate our production-attraction matrix, we first had to choose friction functions for all three trip types. We chose to use power functions, with exponents adjusted so that our model’s average travel times match the NHTS travel data.
For home-based work trips, \(friction = travelTime^{-3}\)
For home-based other trips, \(friction = travelTime^{-0.5}\)
For non-home-based trips, \(friction = travelTime^{-2.9}\)
This produced average travel times that closely matched observed data, as shown below.
| Trip Type | Observed Average (minutes) | Model Average (minutes) |
|---|---|---|
| Home-Based Work | 26.1019 | 26.1546 |
| Home-Based Other | 16.3041 | 16.3455 |
| Non-Home-Based | 15.9448 | 15.8869 |
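To make the calibration idea concrete, here is a minimal sketch of a gravity model with a power friction function. It assumes a doubly constrained form balanced by iterative proportional fitting, which may differ from the balancing done in our software, and the two-zone inputs are hypothetical. It shows how a steeper exponent shifts trips toward nearby zones and lowers the average travel time, which is the lever we adjusted to match the NHTS averages.

```python
def gravity_model(productions, attractions, times, exponent, iterations=100):
    """Doubly constrained gravity model with friction = time ** -exponent."""
    n, m = len(productions), len(attractions)
    friction = [[times[i][j] ** -exponent for j in range(m)] for i in range(n)]
    # Seed each cell with attraction * friction, then balance rows and columns
    flows = [[attractions[j] * friction[i][j] for j in range(m)] for i in range(n)]
    for _ in range(iterations):
        # Scale each row so it sums to the zone's productions
        for i in range(n):
            row_sum = sum(flows[i])
            flows[i] = [f * productions[i] / row_sum for f in flows[i]]
        # Scale each column so it sums to the zone's attractions
        for j in range(m):
            col_sum = sum(flows[i][j] for i in range(n))
            for i in range(n):
                flows[i][j] *= attractions[j] / col_sum
    return flows

def average_time(flows, times):
    """Trip-weighted average travel time in minutes."""
    total_trips = sum(sum(row) for row in flows)
    weighted = sum(f * t for frow, trow in zip(flows, times)
                   for f, t in zip(frow, trow))
    return weighted / total_trips

# Hypothetical two-zone example (travel times in minutes)
times = [[10.0, 20.0], [20.0, 10.0]]
productions = [100.0, 200.0]
attractions = [150.0, 150.0]
steep = gravity_model(productions, attractions, times, exponent=3.0)
shallow = gravity_model(productions, attractions, times, exponent=0.5)
print(round(average_time(steep, times), 2), round(average_time(shallow, times), 2))
```

In a calibration loop, one would raise the exponent when the modeled average is longer than the observed average and lower it when it is shorter.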
Across all three trip types, the county that generates the most trips by far is Cleveland County. This county includes the University of Oklahoma and a large amount of residential land. The vast majority of those trips end in Oklahoma County, which contains the Downtown area. Both the desire line plot and the chord diagrams below illustrate this.
Figure 19: Desire lines plotted between counties in Oklahoma City.
Figure 20: Chord diagram for home-based work trips.
Figure 21: Chord diagram for home-based other trips.
Figure 22: Chord diagram for non-home-based trips.
For this step of the model we began by generating costs for our three forms of transportation: driving alone (SOV), driving with someone else (HOV), and taking transit. Although our transit skim contained fare information, we chose instead to use information from the National Transit Database, as we thought it would better reflect transit discounts. We then assumed that transit cost equals the baseline fare multiplied by the number of transfers. For travel by car, we used NHTS data to first find total expenditure on gas and total driving time. Using these two numbers we generated a cost per minute of driving. We then used information from Table 4.16 of NCHRP 716 to estimate the average vehicle occupancy for each of our trip types (Home Based Work, Non-Home Based, Home-Based Other). We assume that driving cost is shared equally among all car occupants.
| Transportation Mode | Transportation Cost |
|---|---|
| Transit Base Fare | $0.68 |
| SOV | $0.07 per mile |
| HOV (HBO) | $0.025 per mile |
| HOV (NHB) | $0.025 per mile |
| HOV (HBW) | $0.025 per mile |
In choosing mode choice models from NCHRP 716, we applied two criteria:
1. Using a nested logit when possible
2. Using models for which we had all the required values
We chose the following models:
| Trip Type | Model Chosen | Assumptions of Model |
|---|---|---|
| Home Based Work | Model G | Nested; > 1 million; Excludes non-motorized; Submodes for HOV/SOV |
| Home Based Other | Model G | Non-nested; > 1 million; Excludes non-motorized; No submodes |
| Non Home Based | Model G | Non-nested; > 1 million; Excludes non-motorized; Submodes for HOV/SOV |
Next we calculated utility for the different modes. This allows us to encode how much the utility of a trip depends on that trip’s characteristics. For example, a longer wait for a bus decreases the perceived utility of transit, and a longer drive decreases the utility of driving. The NCHRP models give us coefficients (how aspects such as waiting time influence utility), but they don’t provide mode-specific constants, which would tell us how the utilities of the modes relate to each other. To estimate the utility of each mode we started by using the log-odds to generate total mode share for the region.
We then used the calculated utilities in a probability model which generated the mode-share distribution for the region. By using the flow data generated by the gravity model described in the previous section, we were able to calculate the total ridership of our three different modes. Once we generated an initial estimate we were able to adjust our mode share coefficients to match the observed NHTS data, as shown in the table below:
| Mode share | HBO (observed) | HBO (model) | HBW (observed) | HBW (model) | NHB (observed) | NHB (model) |
|---|---|---|---|---|---|---|
| pct_SOV | 0.430 | 0.431 | 0.905 | 0.903 | 0.510 | 0.514 |
| pct_HOV | 0.5416 | 0.5403 | 0.0716 | 0.0749 | 0.4771 | 0.4715 |
| pct_transit | 0.0286 | 0.0282 | 0.0231 | 0.0213 | 0.0125 | 0.0147 |
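The adjustment of mode-specific constants can be illustrated with a simplified sketch. It uses a plain multinomial logit rather than the nested structure of the NCHRP models, the per-zone-pair utilities and trip counts are hypothetical, and the update rule (add the log of target share over modeled share to each constant) is a common calibration technique that we assume here rather than a record of our exact procedure. The target shares are the observed HBW values from the table above.

```python
import math

def logit_shares(utilities):
    """Multinomial logit: P(mode) = exp(V_mode) / sum of exp(V) over all modes."""
    exp_v = {mode: math.exp(v) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}

def regional_shares(od_utilities, od_trips, constants):
    """Trip-weighted mode shares across all zone pairs."""
    totals = {m: 0.0 for m in constants}
    for utilities, trips in zip(od_utilities, od_trips):
        shares = logit_shares({m: utilities[m] + constants[m] for m in constants})
        for m in constants:
            totals[m] += trips * shares[m]
    all_trips = sum(od_trips)
    return {m: totals[m] / all_trips for m in totals}

def calibrate_constants(od_utilities, od_trips, targets, iterations=50):
    """Shift each mode's constant by ln(target share / modeled share), repeatedly."""
    constants = {m: 0.0 for m in targets}
    for _ in range(iterations):
        shares = regional_shares(od_utilities, od_trips, constants)
        for m in constants:
            constants[m] += math.log(targets[m] / shares[m])
    return constants

# Hypothetical utilities (before constants) for two zone pairs
od_utilities = [
    {"SOV": -0.3, "HOV": -0.8, "transit": -2.0},
    {"SOV": -0.9, "HOV": -1.2, "transit": -1.5},
]
od_trips = [700.0, 300.0]
# Observed HBW shares, normalized so they sum to exactly 1
observed = {"SOV": 0.905, "HOV": 0.0716, "transit": 0.0231}
total = sum(observed.values())
targets = {m: s / total for m, s in observed.items()}

constants = calibrate_constants(od_utilities, od_trips, targets)
calibrated = regional_shares(od_utilities, od_trips, constants)
print({m: round(s, 4) for m, s in calibrated.items()})
```

Because only differences between utilities matter in a logit model, the constants are identified only up to a common shift; fixing one mode's constant at zero is a typical convention.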
To begin the trip assignment step, we first converted our production-attraction matrices from person-based (how many people travel between zones) to vehicle-based (how many cars travel between zones). To do this, we divided the number of HOV trips by the predicted average carpool size for each type of trip. We used the same averages that we used in the mode choice step, which came from Table 4.16 of NCHRP 716. We then summed the separate production-attraction matrices for the three trip types (Home Based Work, Non Home Based, and Home Based Other) into one consolidated matrix.
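The person-to-vehicle conversion amounts to the following sketch. The matrices and the occupancy value of 2.5 are hypothetical placeholders, not the NCHRP 716 figures, and the function names are ours for illustration.

```python
def vehicle_trips(sov, hov, occupancy):
    """Convert person-trip matrices to a vehicle-trip matrix.

    Each SOV person trip is one vehicle; HOV person trips are divided by
    the average number of people per carpool.
    """
    return [[s + h / occupancy for s, h in zip(srow, hrow)]
            for srow, hrow in zip(sov, hov)]

def add_matrices(*matrices):
    """Cell-by-cell sum of equally sized matrices (one per trip type)."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*matrices)]

# Hypothetical 2x2 person-trip matrices for one trip type
sov = [[10, 20], [30, 40]]
hov = [[5, 0], [10, 15]]
hbw_vehicles = vehicle_trips(sov, hov, occupancy=2.5)
all_vehicles = add_matrices(hbw_vehicles, hbw_vehicles)
print(hbw_vehicles)
```

In the full model, `add_matrices` would take one vehicle matrix per trip type, each built with its own occupancy.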
From this stage, we loaded our production-attraction matrix into TransCAD to convert the production-attraction matrix into a one-hour origin-destination matrix for the hour 5pm-6pm. We chose this time assuming that it would reveal information about peak congestion times.
The other inputs we needed for this step were the capacity of each link (road) in our network and the free-flow travel times. We calculated capacity by assuming that it is a function of the number of lanes and the speed of the road. We assumed that all roads without lane information in the OpenStreetMap dataset were 2-lane roads. We also assumed that the capacity of a single lane at 60 miles per hour is 1,800 cars per hour. To generate the capacity for each link we used the formula:
\[capacity = 1800 * numberOfLanes * (speed/60)\]
Once we had these three inputs, we were able to calculate vehicle volumes for each link in the network.
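As a small worked illustration, the capacity formula and the volume-to-capacity (v/c) ratio used in the congestion maps can be written as below. The function names and the example link are hypothetical; the 1,800 base capacity and the 2-lane default are the assumptions stated above.

```python
def link_capacity(lanes, speed_mph, base_capacity=1800):
    """capacity = 1800 * numberOfLanes * (speed / 60)."""
    if lanes is None:
        lanes = 2  # assumption for links with no lane count in the source data
    return base_capacity * lanes * (speed_mph / 60)

def volume_to_capacity(volume, lanes, speed_mph):
    """Ratio above 1 means assigned vehicles exceed the link's capacity."""
    return volume / link_capacity(lanes, speed_mph)

# Hypothetical tertiary road: no lane data, 30 mph, 2,700 assigned vehicles
print(link_capacity(None, 30), volume_to_capacity(2700, None, 30))
```

Under these assumptions a two-lane 30 mph road has half the capacity of a two-lane 60 mph road, so lower-speed links reach v/c above 1 at much lower volumes.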
Figure 23: Road Congestion at 5pm in Oklahoma City. Orange and Red links show where the number of vehicles (v) exceeds the capacity (c) of the road
For our scenario, we modeled what would happen if Oklahoma acceded to the demands of the Land Back Movement and returned sovereignty and control of a considerable amount of land that was stolen from the Creek, Seminole, and Kickapoo communities in the seven land runs that occurred between 1889 and 1906 (land was stolen from many other nations in Oklahoma, but those lands are outside our study area). (Sidenote 3: National Cowboy and Western Heritage Museum, Rushes to Statehood: The Oklahoma Land Runs.)
Figure 24: Assessment of Climate Risks for OKC through 2050 by ClimateCheck.com
We will test a scenario where, after reclaiming sovereignty, native communities choose to build mixed-use development with high-density affordable housing. This is inspired by the recent Senakw development in Vancouver, where the Canadian government repatriated land to the Squamish nation, which decided to build a district of dense affordable housing well outside of the previous zoning regulations. We will alter our model by dramatically increasing the population and employment numbers for the TAZs that we identify as being returned to native communities. For the sake of this model, we decided that this increase in population would be made up of people moving into the study area from other parts of the country. This decision was informed in part by the observation that, under climate change, Oklahoma City is projected to have less severe drought than other regions of the Southwest and could see an increase in domestic migration as well as international climate refugees.
Figure 25: A map of the Land Run of 1895
This scenario is concerned with the area of Oklahoma City contained within the Land Run of 1895. We geolocated the map in Figure 25 to identify which census tracts would become part of the new urban center. In total, six tracts with a total area of 353 square miles were included.
Figure 26: Matching Maps of Land Seized to Census Tracts
Figure 27: The census tracts that would be returned to Kickapoo nation sovereignty
In order to model travel behavior under the new scenario, we needed to modify the demographics of the new urban area. We chose to reference the demographic make-up of a downtown census tract (sidenote 4: Census Tract 1019, Oklahoma County) for seeding the new area, using the same ratios of household size, composition, and car ownership, while keeping the median income the same as what those tracts already had. We also kept the same ratio of population density to employment and activity density, while scaling the population to reach our desired total of ~250,000 new residents. The dot density map in Figure 28 visualizes these demographic changes.
Figure 28: Home Based Other productions and attractions in the Land Back Scenario.
The only additional data used for our scenario is a map outlining the land taken in the 1895 land run. See sidenote 3 above.
There were three metrics we wanted to compare between the base model and this scenario: changes in travel flows, total vehicle miles traveled (VMT), and congestion.
To begin, we wanted to see the impact this scenario would have on the travel flows between counties. In our current model, the overwhelming majority of travel for all three trip types originates in Cleveland county and goes to Oklahoma County. Our hypothesis is that in this scenario there will be a more even distribution of trip origin and destination across the seven counties in our study area.
One of the things we wanted to understand with this scenario was how well the current road network could handle a dramatic increase in population. To examine this, we can use the road link vehicle volumes (the output of the trip assignment phase of our model) to see by how much link volumes exceed link capacities. Since there are no public transit routes that reach our area of interest, we will only be looking at vehicles.
We also wanted to understand the impact on total vehicle miles traveled. This scenario essentially posits a situation where the OKC metro area is no longer organized around the central downtown (with the University campus as a secondary attractor) but instead has a more polycentric distribution. This would make the region follow a logic closer to that of the Raleigh-Durham-Cary Research Triangle, Minneapolis-St. Paul, or Dallas-Fort Worth. VMT will go up with more population, but will it go up in direct proportion to population, or will it be influenced by other factors such as centralized versus polycentric development?
We used our new zonal demographic data to rerun our travel behavior model and produce a forecast for the land back scenario.
Figure 29: Desire lines plotted between counties in Oklahoma City.
This new set of county-level travel flows shows more evenly distributed travel and the aforementioned polycentrism, specifically by way of increasing travel to Lincoln County. By subtracting the original model's flows from these flow levels, we can see which county-to-county desire lines are now seeing more traffic:
Figure 30: Increases in travel flow between counties in Oklahoma City.
We can then take a look at which county-to-county desire lines are experiencing less traffic. We see the greatest reduction in travel to Oklahoma County.
Figure 31: Decreases in travel flow between counties in Oklahoma City
And we can also visualize this travel in more detail through chord diagrams:
Figure 32: Chord diagram for home-based work trips.
Figure 33: Chord diagram for home-based other trips.
Figure 34: Chord diagram for non-home-based trips.
The chord diagrams show that there are now significantly more trips going to and from Lincoln County. Notably, this includes not only new trips but also many trips diverted from other destinations. In our original model for home-based work trips, 408,172 trips left Cleveland County for Oklahoma County and only 7,892 left Cleveland County for Lincoln County. In our Land Back model, only 311,942 trips leave Cleveland County for Oklahoma County, while 145,365 leave Cleveland County for Lincoln County.
Figure 35: Road Congestion at 5pm in Oklahoma City under the proposed Land Back scenario. Orange and Red links show where the number of vehicles (v) exceeds the capacity (c) of the road
| Metric | Base Value | Land Back Scenario Value | % Change |
|---|---|---|---|
| Population | 1,425,695 | 1,689,291 | 18.5 |
| VMT | 22,006,966 | 40,856,275 | 85.7 |
In our original model the total population was 1.4 million. The Land Back scenario adds ~250,000 additional people to the metropolitan area, amounting to an 18.5% increase in population. In our original model, VMT was around 22 million, while in the new scenario it totals 40.9 million. This amounts to an 85.7% increase in VMT, indicating that VMT grew far more than proportionally to population.
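The comparison can be checked directly from the population and VMT totals in the table. The short calculation below is only an arithmetic illustration; the ratio of the two percent changes is a rough summary we introduce here, not an output of the model.

```python
def percent_change(base, scenario):
    """Percent change from the base value to the scenario value."""
    return (scenario - base) / base * 100

pop_change = percent_change(1_425_695, 1_689_291)
vmt_change = percent_change(22_006_966, 40_856_275)
# How many percent VMT grows for each percent of population growth
ratio = vmt_change / pop_change
print(round(pop_change, 1), round(vmt_change, 1), round(ratio, 1))
```

If VMT scaled in direct proportion to population, the ratio would be close to 1; here VMT grows several times faster than population.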
As a result of the increase in VMT, we would see much more congestion under present infrastructure conditions. The distributions of congestion for the base model and for our scenario are shown in Figure 36. The dark purple shows where they overlap. While both distributions peak around 1 (log VOC), the scenario distribution shows a clear increase in links with very high levels of congestion.
Figure 36: Histogram of road segment congestion in new scenario vs current conditions
This scenario has tested an extreme case: a population increase of roughly 18% with no changes to the road network or other transportation infrastructure. Even so, a decision maker looking at this could learn how traffic might respond to large increases in population (it would get more congested) and how a move towards a more polycentric metro area can affect the city center (potentially fewer trips to the original downtown).
Assuming some sort of land back was indeed implemented, a planner could use this model in different ways: they could decide to advocate for additional public transit in the study area, to propose expansion of the road network’s capacity, or more likely a mix of the two. This decision-making would also be dependent on the values of the city when it comes to climate change and access for zero-vehicle households.
This type of model can also inform a prioritization strategy. By assessing which roads will be used the most and which areas will be most trafficked, strategic investments can be made to expand transit routes and road capacity in those locations first.
If the population within this area had actually increased so drastically, census tract boundaries would have been redrawn and new tracts would have been added. For our model, this would have meant many more TAZs and more accurate centroids for that new population. It was beyond the scope of this study to try to redraw census tract boundaries, so all the new population is unrealistically concentrated in the centers of the large census tracts.
This analysis and a four-step model generally cannot answer questions about how people’s decision-making with regards to travel might change under the new conditions. We also cannot forecast any land use changes that might occur in other parts of the metropolitan area as a result of the introduction of this new urban area.
For data about population density, income, household size, and vehicle availability, we used 5-year American Community Survey (ACS) sample data from 2021. For information about land use and employment we used Longitudinal Employer-Household Dynamics (LEHD) data. For geographic boundaries we used census data.
To generate the road network we used data pulled from OpenStreetMap, downloaded through the service . We included in our road network all road segments labeled as motorways, motorway_links, secondary, tertiary, trunks, or unclassified roads. We decided to include the unclassified roads when we realized that major roads, including US-77 and US-62, were not included in motorways. Adding in unclassified roads also brought back in “boulevards,” such as Oklahoma City Boulevard and North Lincoln Boulevard. Our assumption is that since the original data did not label any roads as “primary,” many roads that would have been considered primary were instead labeled as unclassified. We then began to generate a transit skim using the software TransCAD.
In this model we used General Transit Feed Specification (GTFS) data pulled from Oklahoma City EMBARK’s GTFS feed.
All primary and secondary roads in rural areas are treated as two-way roads even if coded as one-way roads in the OSM data. This assumption was based on cross-referencing against satellite images, which indicated that these roads had bi-directional traffic despite being coded as one-ways in OSM. We identified rural areas by looking at the network and selecting areas that had large, mostly rectangular traffic analysis zones (TAZs). See Figure 5 for an image of the primary or secondary road segments that we treated as rural two-ways.
We made the following speed assumptions:
* Unclassified roads are 30 mph
* Motorways are 60 mph
* Primary roads are 60 mph
* Secondary roads are 40 mph
* Tertiary roads are 30 mph
* Centroid connectors are 15 mph

In our model we assumed that centroid connectors could model residential roads in each TAZ. Centroid connectors can be up to 25 miles long, but must connect to a road no more than 0.1 miles outside of the zone boundary. Each centroid can have up to 7 centroid connectors.
We are weighting time spent waiting for a bus or train as 2.5 times in-vehicle travel time (IVTT).
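In other words, a minute of waiting counts as 2.5 minutes of riding when transit travel times are compared. A minimal sketch, with a hypothetical function name and example values:

```python
WAIT_WEIGHT = 2.5  # waiting time counts 2.5 times as much as in-vehicle time

def weighted_transit_time(in_vehicle_minutes, wait_minutes):
    """Perceived transit time: IVTT plus weighted waiting time, in minutes."""
    return in_vehicle_minutes + WAIT_WEIGHT * wait_minutes

# Hypothetical trip: 20 minutes on the bus after a 10-minute wait
print(weighted_transit_time(20, 10))
```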
We are using a logistic decay function with an inflection point of 25 minutes and a standard deviation of 5 minutes.
Figure 37: Employment is concentrated in the downtown area. Employment information is not available for many of our traffic analysis zones.
Figure 38: Majority of employment in OKC metro area is in the service industry.
Fig A3: Employment + Activity Density is greatest in downtown OKC
Figure 39: Vehicle Ownership Dot Density Map
Figure 40: Highest income neighborhoods are north of downtown.
Figure 41: Census tracts by income, population density, and number of adults living with their parents.